A Category Oriented Web Search Engine Based on Round Robin Learning and Ranking Algorithm
Abstract
Objectives/Goals: I set out to construct a category-oriented web search engine using a Round Robin learning and ranking algorithm. The search engine was designed to classify and rank Web pages efficiently and to produce more effective search results than existing search engines.
Methods/Materials: More than 40,000 Web pages were loaded into a group of databases and stored as URLs together with the hits of terms in those pages. 500 of these URLs were indexed by the aggregate measure of subject and related keywords to serve as training and testing data, which were used to calculate the optimal decision boundary and Euclidean distance. A program based on these algorithms was developed that automatically classifies and ranks the stored Web pages according to the user's search query and selected category.
Results: Experiments showed that the search engine is effective at categorizing search results and at ranking the relevance of returned Web pages to a search query. The experimental data also demonstrated that the data structure based on the new classification and ranking algorithm delivers satisfactory system performance at little cost in data storage space and search speed.
Conclusions/Discussion: Through Round Robin learning, a category-oriented search engine can be constructed in a technically and economically feasible way. The study showed that the search engine is effective at an inexpensive cost compared with existing commercial search engines that use link-based algorithms. Such a category-oriented search engine has potential uses in hunting terrorists and in academic research. I plan to expand this project by adding more categories and by testing the search engine under other conditions, such as varying the user groups and the volume of Web pages.
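The abstract does not give the details of the algorithm, so the following Python sketch is only one plausible reading of the approach, not the author's implementation. It assumes that each page is a vector of term hits, that Round Robin learning means training one binary decision per pair of categories, that each pairwise boundary is the perpendicular bisector between two category centroids (a nearest-centroid rule swapped in for the unspecified "optimal decision boundary"), and that ranking orders pages by Euclidean distance to a query vector. All names, URLs, and numbers are hypothetical.

```python
import math
from itertools import combinations
from collections import Counter


def centroid(vectors):
    """Mean term-hit vector of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two term-hit vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def train(training):
    """Assumption: each category is summarized by the centroid of its training pages."""
    return {cat: centroid(vecs) for cat, vecs in training.items()}


def classify(page, centroids):
    """Round robin vote: every pair of categories 'plays' once, most wins takes the page."""
    votes = Counter()
    for a, b in combinations(sorted(centroids), 2):
        # Assumed pairwise boundary: the page goes to the nearer of the two centroids.
        nearer = a if euclidean(page, centroids[a]) <= euclidean(page, centroids[b]) else b
        votes[nearer] += 1
    # Ties are broken alphabetically; a single category wins by default.
    return max(sorted(centroids), key=lambda cat: votes[cat])


def rank(pages, category, query, centroids):
    """Keep pages classified into `category`, ordered by distance to the query vector."""
    hits = [(url, vec) for url, vec in pages.items()
            if classify(vec, centroids) == category]
    return [url for url, vec in sorted(hits, key=lambda p: euclidean(p[1], query))]


# Hypothetical training data: hits for three terms, per category.
centroids = train({
    "science": [[5, 0, 1], [4, 1, 0]],
    "sports": [[0, 5, 1], [1, 4, 0]],
    "travel": [[0, 1, 5], [1, 0, 4]],
})

# Hypothetical stored pages: URL -> term-hit vector.
pages = {
    "example.org/a": [4, 1, 0],
    "example.org/b": [6, 0, 0],
    "example.org/c": [0, 4, 1],
    "example.org/d": [3, 2, 1],
}

results = rank(pages, "science", [4, 1, 0], centroids)
print(results)
```

With three categories the round robin runs three pairwise contests per page; the number of contests grows quadratically with the number of categories, which is worth keeping in mind if more categories are added.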
Similar resources
Web pages ranking algorithm based on reinforcement learning and user feedback
The main challenge for a search engine is ranking web documents to provide the best response to a user's query. Although a query returns a huge number of results, users examine only the first few; placing relevant results in the top ranks is therefore of great importance. In this paper, a ranking algorithm based on the reinforcement le...
RRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features
The principal aim of a search engine is to provide results sorted according to the user's requirements. To achieve this aim, it employs ranking methods to rank web documents by their significance and relevance to the user query. The novelty of this paper is a user feedback-based ranking algorithm using reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...
A New Hybrid Method for Web Pages Ranking in Search Engines
There are many algorithms for optimizing search engine results. Ranking takes place according to one or more parameters, such as backward links, forward links, content, and click-through rate. The quality and performance of these algorithms depend on the listed parameters. Ranking is one of the most important components of the search engine that represents the degree of the vitality...
Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore
Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...
An Ensemble Click Model for Web Document Ranking
Every year, web search engine providers spend more and more money on document ranking in search engine result pages (SERPs). Click models provide useful information for ranking documents in SERPs by modeling interactions between users and search engines. Here, three modules are employed to create a hybrid click model; the first module is a PGM-based click model, the second module in a d...
Publication date: 2004